Learning Large-Scale Bayesian Networks with the sparsebn Package
Learning graphical models from data is an important problem with wide
applications, ranging from genomics to the social sciences. Nowadays datasets
often have upwards of thousands---sometimes tens or hundreds of thousands---of
variables and far fewer samples. To meet this challenge, we have developed a
new R package called sparsebn for learning the structure of large, sparse
graphical models with a focus on Bayesian networks. While there are many
existing software packages for this task, this package focuses on the unique
setting of learning large networks from high-dimensional data, possibly with
interventions. As such, the methods provided place a premium on scalability and
consistency in a high-dimensional setting. Furthermore, in the presence of
interventions, the methods implemented here achieve the goal of learning a
causal network from data. Additionally, the sparsebn package is fully
compatible with existing software packages for network analysis. Comment: To appear in the Journal of Statistical Software, 39 pages, 7 figures
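The regime this abstract describes (thousands of variables, far fewer samples) can be made concrete with a short simulation. The sketch below generates data from a sparse linear Gaussian SEM, which is one standard model for this setting; it illustrates the data-generating scenario only and is not the sparsebn package's API (all function and variable names here are invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_sparse_sem(n, d, edge_prob=0.01):
    """Simulate n samples from a sparse linear Gaussian SEM X = X B + E,
    where B is a strictly upper-triangular (hence acyclic) coefficient
    matrix with few nonzero entries. Illustrative only -- this shows the
    data-generating setting, not the sparsebn package's interface."""
    B = np.triu(rng.uniform(0.5, 1.5, size=(d, d)), k=1)
    B *= rng.random((d, d)) < edge_prob            # keep ~1% of candidate edges
    E = rng.normal(size=(n, d))                    # Gaussian noise
    X = E @ np.linalg.inv(np.eye(d) - B)           # solves X = X B + E
    return X, B

# Far fewer samples (n = 50) than variables (d = 1000).
X, B = simulate_sparse_sem(n=50, d=1000)
print(X.shape)  # (50, 1000)
```

Because B is strictly upper triangular, the implied graph is guaranteed to be a DAG, and I - B is always invertible.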
Learning nonparametric latent causal graphs with unknown interventions
We establish conditions under which latent causal graphs are
nonparametrically identifiable and can be reconstructed from unknown
interventions in the latent space. Our primary focus is the identification of
the latent structure in measurement models without parametric assumptions such
as linearity or Gaussianity. Moreover, we do not assume the number of hidden
variables is known, and we show that at most one unknown intervention per
hidden variable is needed. This extends a recent line of work on learning
causal representations from observations and interventions. The proofs are
constructive and introduce two new graphical concepts -- imaginary subsets and
isolated edges -- that may be useful in their own right. As a matter of
independent interest, the proofs also involve a novel characterization of the
limits of edge orientations within the equivalence class of DAGs induced by
unknown interventions. These are the first results to characterize the
conditions under which causal representations are identifiable without making
any parametric assumptions in a general setting with unknown interventions and
without faithfulness. Comment: To appear at NeurIPS 202
A super-polynomial lower bound for learning nonparametric mixtures
We study the problem of learning nonparametric distributions in a finite
mixture, and establish a super-polynomial lower bound on the sample complexity
of learning the component distributions in such models. Namely, we are given
i.i.d. samples from where and we are interested in learning each component .
Without any assumptions on , this problem is ill-posed. In order to
identify the components , we assume that each can be written as a
convolution of a Gaussian and a compactly supported density with
. Our main result shows
that
samples are required for estimating each . The proof relies on a fast rate
for approximation with Gaussians, which may be of independent interest. This
result has important implications for the hardness of learning more general
nonparametric latent variable models that arise in machine learning
applications.
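The mixture model above can be sketched concretely. Assuming uniform densities as the compactly supported components (an illustrative stand-in; the setting allows any compactly supported density), a minimal Python simulation of i.i.d. draws from such a Gaussian-convolved mixture looks like:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(n, weights, supports, sigma=1.0):
    """Draw n i.i.d. samples from f = w_1 f_1 + ... + w_m f_m, where each
    component f_i is the convolution of a N(0, sigma^2) Gaussian with a
    compactly supported density -- here uniform on [-supports[i], supports[i]],
    chosen purely for illustration."""
    w = np.asarray(weights, dtype=float)
    a = np.asarray(supports, dtype=float)
    comp = rng.choice(len(w), size=n, p=w / w.sum())   # pick a component
    latent = rng.uniform(-a[comp], a[comp])            # compactly supported part
    return latent + rng.normal(0.0, sigma, size=n)     # Gaussian convolution

x = sample_mixture(10_000, weights=[0.4, 0.6], supports=[0.5, 2.0])
print(x.shape)  # (10000,)
```

Adding the latent compactly supported draw to independent Gaussian noise is exactly sampling from the convolution of the two densities.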
DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization
The combinatorial problem of learning directed acyclic graphs (DAGs) from
data was recently framed as a purely continuous optimization problem by
leveraging a differentiable acyclicity characterization of DAGs based on the
trace of a matrix exponential function. Existing acyclicity characterizations
are based on the idea that powers of an adjacency matrix contain information
about walks and cycles. In this work, we propose an acyclicity characterization based on the log-determinant (log-det)
function, which leverages the nilpotency property of DAGs. To deal with the
inherent asymmetries of a DAG, we relate the domain of our log-det
characterization to the set of M-matrices, which is a key difference relative
to the classical log-det function defined over the cone of positive definite
matrices. Similar to acyclicity functions previously proposed, our
characterization is also exact and differentiable. However, when compared to
existing characterizations, our log-det function: (1) Is better at detecting
large cycles; (2) Has better-behaved gradients; and (3) Its runtime is in
practice about an order of magnitude faster. From the optimization side, we
drop the typically used augmented Lagrangian scheme, and propose DAGMA
(DAGs via M-matrices for Acyclicity), a method
that resembles the central path for barrier methods. Each point in the central
path of DAGMA is a solution to an unconstrained problem regularized by our
log-det function, and we show that in the limit of the central path the
solution is guaranteed to be a DAG. Finally, we provide extensive experiments
for linear and nonlinear SEMs, and show that our approach
can reach large speed-ups and smaller structural Hamming distances against
state-of-the-art methods.Comment: To appear at NeurIPS 202
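As a rough illustration of the log-det idea described above, the sketch below implements an acyclicity function of the form h(W) = -log det(sI - W∘W) + d·log s, with W∘W the elementwise square, a form consistent with the abstract's description; the choice s = 1 and the example matrices are illustrative assumptions. For a DAG, W∘W is nilpotent, so det(sI - W∘W) = s^d and h(W) = 0, while a cycle makes h(W) strictly positive.

```python
import numpy as np

def h_logdet(W, s=1.0):
    """Log-det acyclicity function h(W) = -log det(sI - W*W) + d log s,
    where W*W is the elementwise square. On its M-matrix domain, h(W) = 0
    exactly when W is the adjacency matrix of a DAG (W*W nilpotent)."""
    d = W.shape[0]
    sign, logabsdet = np.linalg.slogdet(s * np.eye(d) - W * W)
    assert sign > 0, "W outside the domain: sI - W*W must be an M-matrix"
    return -logabsdet + d * np.log(s)

# Strictly upper-triangular W is a DAG: det(sI - W*W) = s^d, so h = 0.
W_dag = np.array([[0.0, 0.5, 0.3],
                  [0.0, 0.0, 0.7],
                  [0.0, 0.0, 0.0]])
print(np.isclose(h_logdet(W_dag), 0.0))  # True

# A 2-cycle between the first two nodes makes h strictly positive.
W_cyc = np.array([[0.0, 0.5, 0.0],
                  [0.5, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
print(h_logdet(W_cyc) > 0.0)  # True
```

Unlike the trace-of-matrix-exponential characterization, the log-det form weights walks of all lengths through the determinant, which is the property the abstract credits for better detection of large cycles.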